
DeepSeek · Chat / LLM · 671B Parameters (37B Active) · 128K Context

Streaming · Reasoning · Chain-of-Thought · Code · JSON Output · Long Context

Overview
DeepSeek R1-0528 is the May 2025 update to the original DeepSeek-R1, built on the DeepSeek-V3 backbone with 671B total parameters and 37B active per inference pass via Sparse MoE. It achieves performance on par with OpenAI o1, with key improvements including 87.5% on AIME 2025 (up from 70%), reduced hallucinations, enhanced front-end capabilities, and newly added JSON output and function calling support. With chain-of-thought reasoning traces and MIT licensing, R1-0528 is one of the most capable open-source reasoning models available today, served instantly via the Qubrid AI Serverless API.

🧠 671B total / 37B active: frontier reasoning at MoE efficiency. Deploy via Qubrid AI with no infrastructure required.
Model Specifications
| Field | Details |
|---|---|
| Model ID | deepseek-ai/DeepSeek-R1-0528 |
| Provider | DeepSeek |
| Kind | Chat / LLM |
| Architecture | DeepSeek-V3 backbone: Sparse MoE with 671B total / 37B active parameters, Multi-head Latent Attention (MLA), Multi-Token Prediction (MTP) speculative decoding |
| Parameters | 671B total (37B active per inference pass) |
| Context Length | 128,000 Tokens |
| MoE | Yes (Sparse MoE, 37B active per pass) |
| Release Date | May 2025 |
| License | MIT |
| Training Data | Large-scale diverse dataset; post-trained with RL (GRPO) for enhanced reasoning depth |
| Function Calling | Supported |
| Image Support | N/A |
| Serverless API | Available |
| Fine-tuning | Coming Soon |
| On-demand | Coming Soon |
| State | 🟢 Ready |
Pricing
💳 Access via the Qubrid AI Serverless API with pay-per-token pricing. No infrastructure management required.
| Token Type | Price per 1M Tokens |
|---|---|
| Input Tokens | $0.90 |
| Input Tokens (Cached) | $0.28 |
| Output Tokens | $3.20 |
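As a quick sanity check on the rates above, here is a short sketch of how a single request's cost works out; the token counts in the example are made up for illustration:

```python
# Pricing per 1M tokens for DeepSeek R1-0528 on Qubrid AI (from the table above).
PRICE_INPUT = 0.90
PRICE_INPUT_CACHED = 0.28
PRICE_OUTPUT = 3.20

def request_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """USD cost of one request; cached_tokens is the cached subset of the input."""
    uncached = input_tokens - cached_tokens
    return (uncached * PRICE_INPUT
            + cached_tokens * PRICE_INPUT_CACHED
            + output_tokens * PRICE_OUTPUT) / 1_000_000

# Example: 10K input tokens (8K of them cached) and a 2K-token reasoning-heavy answer.
print(f"${request_cost(10_000, 8_000, 2_000):.6f}")  # $0.010440
```

Note how cached input tokens cut that portion of the bill by roughly two thirds, which adds up quickly for long system prompts reused across requests.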
Quickstart
Prerequisites
- Create a free account at platform.qubrid.com
- Generate your API key from the API Keys section
- Replace QUBRID_API_KEY in the code below with your actual key
⚠️ Temperature note: Keep temperature in the 0.5–0.7 range (default 0.6) to prevent repetitive outputs. Values outside this range may degrade reasoning quality.
Python
JavaScript
Go
cURL
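As a minimal Python sketch of the call flow, assuming Qubrid's Serverless API follows the OpenAI-compatible chat-completions shape described below (the base URL here is a placeholder, not the documented endpoint; check docs.platform.qubrid.com for the real one):

```python
import json
import os
import urllib.request

# Assumption: an OpenAI-compatible endpoint. The exact base URL may differ.
BASE_URL = "https://platform.qubrid.com/v1"  # hypothetical
MODEL_ID = "deepseek-ai/DeepSeek-R1-0528"

def build_request(prompt: str, temperature: float = 0.6, max_tokens: int = 16384) -> dict:
    """Assemble an OpenAI-style chat-completions payload for R1-0528."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # keep within the recommended 0.5-0.7 range
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """Send one chat request and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['QUBRID_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__" and os.environ.get("QUBRID_API_KEY"):
    print(ask("Explain quantum computing in simple terms"))
```

Because the API is OpenAI-compatible, the official openai SDK works the same way once you point its base_url at Qubrid.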
Live Example
Prompt: Explain quantum computing in simple terms
Response:
Playground Features
The Qubrid AI Playground lets you interact with DeepSeek R1-0528 directly in your browser: no setup, no code, no cost to explore.

🧠 System Prompt
Define the model's role, output format, and reasoning constraints before the conversation begins. This is particularly powerful for structured reasoning tasks and JSON output workflows. Set your system prompt once in the Qubrid Playground and it applies across every turn of the conversation.
🎯 Few-Shot Examples
Guide the model's reasoning style and output format with concrete examples, especially effective for complex structured tasks.

| User Input | Assistant Response |
|---|---|
| What is the derivative of x³ + 2x² - 5x + 1? | Step 1: Apply the power rule to each term. d/dx(x³) = 3x², d/dx(2x²) = 4x, d/dx(-5x) = -5, d/dx(1) = 0. Result: 3x² + 4x - 5 |
| Debug this Python function: def add(a, b): return a - b | Bug found: the operator is subtraction (-) but the function name implies addition. Fix: return a + b |
💡 Stack multiple few-shot examples in the Qubrid Playground to guide reasoning depth, chain-of-thought format, and output structure, no fine-tuning required.
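The few-shot pairs above map directly onto alternating user/assistant turns in the chat-completions messages array. A minimal sketch of packing them programmatically (the helper name and example strings are illustrative, not part of any Qubrid SDK):

```python
def with_few_shot(system: str, examples: list, question: str) -> list:
    """Build a messages array: system prompt, few-shot (user, assistant) pairs, then the real question."""
    messages = [{"role": "system", "content": system}]
    for user_turn, assistant_turn in examples:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": question})
    return messages

msgs = with_few_shot(
    "You are a step-by-step math tutor.",
    [("What is the derivative of x^3 + 2x^2 - 5x + 1?",
      "Step 1: Apply the power rule to each term. Result: 3x^2 + 4x - 5")],
    "What is the derivative of 4x^2?",
)
# One system message, one example pair, then the real question: 4 messages total.
print(len(msgs))  # 4
```

The assistant turns in the examples act as templates: the model tends to mirror their structure and level of detail in its own reply.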
Inference Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Streaming | boolean | true | Enable streaming responses for real-time output |
| Temperature | number | 0.6 | Recommended range 0.5–0.7 to prevent endless repetitions |
| Max Tokens | number | 16384 | Maximum number of tokens to generate |
| Top P | number | 0.95 | Nucleus sampling: considers tokens with top_p probability mass |
Use Cases
- Advanced mathematical reasoning
- Code generation and debugging
- Complex multi-step problem solving
- Research and analysis
- JSON-structured output generation
- Function calling and tool use
Strengths & Limitations
| Strengths | Limitations |
|---|---|
| 671B total / 37B active MoE: frontier reasoning at high efficiency | 128K max context (shorter than some competitors) |
| 87.5% on AIME 2025, up 17.5 points from the prior release's 70% | Requires very large infrastructure for self-hosting |
| Supports JSON output and function calling | Narrow recommended temperature range (0.5–0.7) |
| Reduced hallucinations vs prior R1 | Reasoning traces increase total output length |
| Fully open-source with MIT license | |
| Chain-of-thought reasoning with visible traces | |
Why Qubrid AI?
- 🚀 No infrastructure setup: 671B MoE served serverlessly, pay only for what you use
- 🔌 OpenAI-compatible: drop-in replacement using the same SDK, just swap the base URL
- 💰 Cached input pricing: $0.28/1M for cached tokens, dramatically reducing costs on repeated context
- 🧪 Built-in Playground: prototype with system prompts and few-shot examples instantly at platform.qubrid.com
- 📊 Full observability: API logs and usage tracking built into the Qubrid dashboard
- 🌐 Multi-language support: Python, JavaScript, Go, cURL out of the box
Resources
| Resource | Link |
|---|---|
| 📘 Qubrid Docs | docs.platform.qubrid.com |
| 🎮 Playground | Try DeepSeek R1-0528 live |
| 🔑 API Keys | Get your API Key |
| 🤗 Hugging Face | deepseek-ai/DeepSeek-R1-0528 |
| 💬 Discord | Join the Qubrid Community |
Built with ❤️ by Qubrid AI
Frontier models. Serverless infrastructure. Zero friction.